Before you begin

Notes

A few notes about this script.

If you are running this with the 2022-2023 data, make sure you download the whole OSM_2022-2023 GitHub repository (https://github.com/ACMElabUvic/OSM_2022-2023) from the ACMElabUvic GitHub. This will ensure you have all the files, data, and proper folder structure you will need to run this code and associated analyses.

Also make sure you open RStudio through the R project (OSM_2022-2023.Rproj). This will automatically set your working directory to the correct place (wherever you saved the repository) and ensure you don’t have to change the file paths for some of the data.

Lastly, if you are looking to adapt this code for a future year of data, you will want to ensure you have run the ACME_camera_script_9-2-2024.R or .Rmd with your data as there is much data formatting, cleaning, and restructuring that has to be done before this code will work.

If you have questions, please email the most recent author, currently:

Marissa A. Dyck
Postdoctoral research fellow
University of Victoria
School of Environmental Studies
Email: marissadyck17@gmail.com

Install packages

If you don’t already have the following packages installed, use the code below to install them.

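install.packages('tidyverse')
install.packages('ggpubr')
install.packages('corrplot')
install.packages('PerformanceAnalytics') # loaded below with library(), so it needs to be installed too
install.packages('Hmisc')
install.packages('glmmTMB')
install.packages('MuMIn')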
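
If you rerun this script often, it may be convenient to install only the packages that are missing rather than reinstalling everything. The sketch below is one way to do that. Note that missing_packages() is a hypothetical helper written for this example (it is not part of any package), and the actual install call is left commented out so that nothing is installed by accident.

```r
# hypothetical helper (not from any package): returns the packages in `pkgs`
# that are not yet installed in the current R library
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# packages installed or loaded in this script
script_pkgs <- c('tidyverse', 'ggpubr', 'corrplot', 'PerformanceAnalytics',
                 'Hmisc', 'glmmTMB', 'MuMIn')

# check which of them still need installing
to_install <- missing_packages(script_pkgs)
to_install

# uncomment to install only what is missing
# if (length(to_install) > 0) install.packages(to_install)

# base packages ship with R, so this returns an empty character vector
missing_packages(c('stats', 'utils'))
```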

Load libraries

Then load the packages to your library.

library(tidyverse) # data tidying, visualization, and much more; this will load all tidyverse packages, can see complete list using tidyverse_packages()
library(ggpubr) # make modifications to plots for publication (arrange plots)
library(PerformanceAnalytics)    #Used to generate a correlation plot
## Loading required package: xts
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## 
## ################################### WARNING ###################################
## # We noticed you have dplyr installed. The dplyr lag() function breaks how    #
## # base R's lag() function is supposed to work, which breaks lag(my_xts).      #
## #                                                                             #
## # Calls to lag(my_xts) that you enter or source() into this session won't     #
## # work correctly.                                                             #
## #                                                                             #
## # All package code is unaffected because it is protected by the R namespace   #
## # mechanism.                                                                  #
## #                                                                             #
## # Set `options(xts.warn_dplyr_breaks_lag = FALSE)` to suppress this warning.  #
## #                                                                             #
## # You can use stats::lag() to make sure you're not using dplyr::lag(), or you #
## # can add conflictRules('dplyr', exclude = 'lag') to your .Rprofile to stop   #
## # dplyr from breaking base R's lag() function.                                #
## ################################### WARNING ###################################
## 
## Attaching package: 'xts'
## The following objects are masked from 'package:dplyr':
## 
##     first, last
## 
## Attaching package: 'PerformanceAnalytics'
## The following object is masked from 'package:graphics':
## 
##     legend
library(Hmisc) # used to generate histograms for all variables in data frame
## Loading required package: lattice
## Loading required package: survival
## Loading required package: Formula
## 
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:dplyr':
## 
##     src, summarize
## The following objects are masked from 'package:base':
## 
##     format.pval, units
library(glmmTMB)      #Constructing GLMMs
## Warning in checkMatrixPackageVersion(): Package version inconsistency detected.
## TMB was built with Matrix version 1.4.1
## Current Matrix version is 1.5.3
## Please re-install 'TMB' from source using install.packages('TMB', type = 'source') or ask CRAN for a binary version of 'TMB' matching CRAN's 'Matrix' package
## Warning in checkDepPackageVersion(dep_pkg = "TMB"): Package version inconsistency detected.
## glmmTMB was built with TMB version 1.9.6
## Current TMB version is 1.9.1
## Please re-install glmmTMB from source or restore original 'TMB' package (see '?reinstalling' for more information)
library(MuMIn) # for model selection

Data

Load detection data

Read in saved and cleaned detection data from the ACME_camera_script_9-2-2024.R.

# detection data
# read in saved and cleaned detection data from the ACME_camera_script_9-2-2024.R 
detections <- read_csv('data/processed/OSM_2022_ind_det.csv') %>% 
  
  # change all character columns (array, site, species, event_id) to factors
  mutate(across(where(is.character),
                as.factor))
## Rows: 14102 Columns: 8
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (4): array, site, species, event_id
## dbl  (3): month, year, timediff
## dttm (1): datetime
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Data formatting

To get plots with the same formatting as last year’s report, we have to do a bit of data formatting. First we need to make sure we are including the same relevant species (some were ignored for last year’s report or grouped together).

Last year’s report had the following species:

  • white-tailed deer
  • snowshoe hare
  • black bear
  • coyote
  • red squirrel
  • fisher
  • unknown
  • moose
  • lynx
  • spruce grouse
  • red fox
  • striped skunk
  • ruffed grouse
  • owl
  • grey wolf
  • domestic dog
  • cougar
  • raven
  • other
  • mule deer

They also grouped all humans except for staff as ‘Humans’. Let’s look at the species we have in this year’s data and try to format it the same way.

detections %>% 
  
  # group by array and species
  group_by(array, species) %>% 
  summarise(n = n()) %>% 
  
  # have R print everything
  print(n = nrow(.))
## `summarise()` has grouped output by 'array'. You can override using the
## `.groups` argument.
## # A tibble: 119 × 3
## # Groups:   array [4]
##     array species                 n
##     <fct> <fct>               <int>
##   1 LU01  Beaver                  1
##   2 LU01  Black bear            380
##   3 LU01  Cougar                  7
##   4 LU01  Coyote                581
##   5 LU01  Domestic dog            6
##   6 LU01  Fisher                111
##   7 LU01  Grey jay               14
##   8 LU01  Grey wolf              21
##   9 LU01  Human                   3
##  10 LU01  Lynx                   55
##  11 LU01  Moose                  99
##  12 LU01  Other                   1
##  13 LU01  Other birds            60
##  14 LU01  Otter                   2
##  15 LU01  Owl                     2
##  16 LU01  Porcupine               5
##  17 LU01  Raven                   6
##  18 LU01  Red fox                50
##  19 LU01  Red squirrel          879
##  20 LU01  Ruffed grouse          14
##  21 LU01  Short-tailed weasel     5
##  22 LU01  Snowshoe hare        1443
##  23 LU01  Spruce grouse          12
##  24 LU01  Staff                  71
##  25 LU01  Striped skunk          39
##  26 LU01  Unknown               210
##  27 LU01  Unknown canid          48
##  28 LU01  Unknown deer          175
##  29 LU01  Unknown mustelid       13
##  30 LU01  Unknown ungulate        8
##  31 LU01  White-tailed deer    1953
##  32 LU13  ATVer                  31
##  33 LU13  Black bear            275
##  34 LU13  Caribou                 3
##  35 LU13  Coyote                187
##  36 LU13  Fisher                  5
##  37 LU13  Grey jay                2
##  38 LU13  Grey wolf              52
##  39 LU13  Human                   2
##  40 LU13  Hunter                  1
##  41 LU13  Long-tailed weasel      1
##  42 LU13  Lynx                  115
##  43 LU13  Marten                 27
##  44 LU13  Moose                 128
##  45 LU13  Other birds            12
##  46 LU13  Owl                     1
##  47 LU13  Red fox                 2
##  48 LU13  Red squirrel          240
##  49 LU13  Ruffed grouse           7
##  50 LU13  Short-tailed weasel     7
##  51 LU13  Snowshoe hare         573
##  52 LU13  Spruce grouse          25
##  53 LU13  Staff                  82
##  54 LU13  Striped skunk           1
##  55 LU13  Unknown                86
##  56 LU13  Unknown canid          10
##  57 LU13  Unknown deer            5
##  58 LU13  Unknown mustelid        3
##  59 LU13  White-tailed deer      86
##  60 LU13  Wolverine               8
##  61 LU15  ATVer                   1
##  62 LU15  Beaver                  2
##  63 LU15  Black bear            220
##  64 LU15  Canada goose            3
##  65 LU15  Caribou                51
##  66 LU15  Coyote                171
##  67 LU15  Fisher                 25
##  68 LU15  Grey jay               21
##  69 LU15  Grey wolf              61
##  70 LU15  Long-tailed weasel     15
##  71 LU15  Lynx                  122
##  72 LU15  Marten                 63
##  73 LU15  Moose                 157
##  74 LU15  Other birds            59
##  75 LU15  Otter                   5
##  76 LU15  Owl                     1
##  77 LU15  Red fox                39
##  78 LU15  Red squirrel          643
##  79 LU15  Ruffed grouse          11
##  80 LU15  Short-tailed weasel     7
##  81 LU15  Snowmobiler             1
##  82 LU15  Snowshoe hare         611
##  83 LU15  Spruce grouse          21
##  84 LU15  Staff                  78
##  85 LU15  Unknown                98
##  86 LU15  Unknown canid           7
##  87 LU15  Unknown deer           47
##  88 LU15  Unknown mustelid       16
##  89 LU15  Unknown ungulate        5
##  90 LU15  White-tailed deer     429
##  91 LU21  Black bear            544
##  92 LU21  Canada goose            1
##  93 LU21  Caribou                16
##  94 LU21  Cougar                  2
##  95 LU21  Coyote                 51
##  96 LU21  Fisher                 46
##  97 LU21  Grey jay               13
##  98 LU21  Grey wolf              55
##  99 LU21  Long-tailed weasel      1
## 100 LU21  Lynx                   72
## 101 LU21  Marten                 50
## 102 LU21  Moose                 233
## 103 LU21  Other                   1
## 104 LU21  Other birds            44
## 105 LU21  Owl                     8
## 106 LU21  Red fox                14
## 107 LU21  Red squirrel          219
## 108 LU21  Ruffed grouse          11
## 109 LU21  Short-tailed weasel     2
## 110 LU21  Snowmobiler             6
## 111 LU21  Snowshoe hare         284
## 112 LU21  Spruce grouse          19
## 113 LU21  Staff                  71
## 114 LU21  Unknown               162
## 115 LU21  Unknown canid           5
## 116 LU21  Unknown deer           65
## 117 LU21  Unknown mustelid       23
## 118 LU21  Unknown ungulate        4
## 119 LU21  White-tailed deer     839
# now let's create a new data frame (tibble) to work with for the OSM figure summaries specifically

# I personally would lump all the unknowns together and all the birds together, but for the sake of consistency with last year's figures we will remove some entries; let's create a vector of entries to drop

species_drop <- c('Staff',
                  'Unknown deer',
                  'Unknown ungulate',
                  'Unknown canid',
                  'Unknown mustelid',
                  'Other birds')

# now we can create the new data frame with some changes consistent w/ choices made for 2021-2022
detections <- detections %>% 
  
  # for summarizing, let's lump all the recreational humans into "Human"
  mutate(species = recode_factor(species, 
                                 "Snowmobiler" = "Human",
                                 "ATVer" = "Human",
                                 'Hunter' = 'Human')) %>% 
  
  # remove species we don't want to plot
  filter(!species %in% species_drop)

We will also want to subset the data by landscape unit (LU) and generate a new data frame for each LU to use for plotting.

# we will also want to create a data frame for each LU to plot individually

# LU1
dets_LU1 <- detections %>% 
  filter(array == 'LU01')

# LU13
dets_LU13 <- detections %>% 
  filter(array == 'LU13')

# LU15
dets_LU15 <- detections %>% 
  filter(array == 'LU15')

# LU21
dets_LU21 <- detections %>% 
  filter(array == 'LU21')

ANDREW HELP

Can you make the above code into a for loop that assigns each new data frame created from subsetting as dets_LUname?
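
One possible answer to the question above, as a minimal sketch: base R's split() returns a named list with one data frame per array, and a short for loop with assign() can then create a separate object for each. The toy tibble detections_toy and its values are made up for illustration. Also note that the objects come out named after the array values (dets_LU01 rather than dets_LU1).

```r
library(tibble)

# toy stand-in for the real detections data (values are made up)
detections_toy <- tibble(
  array   = factor(c('LU01', 'LU01', 'LU13', 'LU15', 'LU21')),
  site    = c('LU01_06', 'LU01_10', 'LU13_18', 'LU15_02', 'LU21_07'),
  species = c('Moose', 'Lynx', 'Moose', 'Coyote', 'Fisher')
)

# split into a named list with one data frame per array;
# drop = TRUE skips factor levels that have no rows
dets_by_lu <- split(detections_toy, detections_toy$array, drop = TRUE)

# prefix the list names so they read dets_LU01, dets_LU13, ...
names(dets_by_lu) <- paste0('dets_', names(dets_by_lu))

# for loop that assigns each list element to its own object
for (lu_name in names(dets_by_lu)) {
  assign(lu_name, dets_by_lu[[lu_name]])
}

# each LU now has its own data frame
dets_LU01
```

In practice the named list on its own is often enough, since it can be passed straight to purrr::map() or purrr::imap() without creating separate objects.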

Detection plots

Detection data

Now we can apply the same data formatting to each LU’s data frame using purrr.

We want to count the number of independent detections per species per LU to use in the detection plots.

# apply the same formatting to each LU data frame using purrr map
detection_data <- list(dets_LU1,
                       dets_LU13,
                       dets_LU15,
                       dets_LU21) %>% 
  
  purrr::map(
    ~.x %>% 
      
      # group by species
      group_by(species) %>%
      
      # add a column with the number of unique events (independent detections) for each species
      mutate(count = n_distinct(event_id)) %>% 
      
      # keep just the columns we need
      select(species, count) %>% 
      
      # keep only unique (distinct) rows so we are left with one row per species; without this step ggplot will try to plot every row
      distinct()) %>% 
  
  # set names of list objects
  purrr::set_names('Detections LU01',
                   'Detections LU13',
                   'Detections LU15',
                   'Detections LU21')
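
To make the counting step concrete, here is a small sketch on made-up data (toy_dets is hypothetical). It uses summarise() rather than the mutate(), select(), distinct() sequence above; for this purpose the two should give the same one-row-per-species result.

```r
library(dplyr)

# toy detections: two Moose rows share event E1, so Moose has 2 independent events
toy_dets <- tibble(
  species  = c('Moose', 'Moose', 'Moose', 'Lynx'),
  event_id = c('E1', 'E1', 'E2', 'E3')
)

# count unique events per species, one row per species
toy_counts <- toy_dets %>% 
  group_by(species) %>% 
  summarise(count = n_distinct(event_id),
            .groups = 'drop')

toy_counts
```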

Detection plots

Now we can graph independent detections for each LU using purrr, which avoids a lot of the code repetition needed to plot each one individually.

We use purrr::imap() instead of purrr::map() because imap() passes the name of each list element (e.g. Detections LU01, Detections LU13, etc.) to the function as .y, which we can then use to title each plot.

Within purrr::imap() we just paste the code we would use for a single ggplot, since all the graphical elements (except the title, which we set from the list element name with .y) are the same.
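
As a quick illustration of how .x and .y behave inside purrr::imap(), the sketch below uses a made-up named list (toy_list) instead of the real data. Here .x is the list element and .y is its name.

```r
library(purrr)

# toy named list standing in for detection_data (values are made up)
toy_list <- list('Detections LU01' = c(3, 5),
                 'Detections LU13' = c(2, 2, 2))

# .x is each element, .y is that element's name
toy_titles <- imap(toy_list,
                   ~ paste0(.y, ': ', sum(.x), ' detections'))

toy_titles$`Detections LU01`
# [1] "Detections LU01: 8 detections"
```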

# create object detection plots which uses the detection_data list (w/ all 4 LUs)
detection_plots <- detection_data %>% 
  
  # use imap instead of map as it allows us to use .y to paste the list element names as the plot titles later
  purrr::imap(
    ~.x %>% 
      
      # now just copy and paste the ggplot code for the detection graphs
      ggplot(.,
             aes(x = reorder(species, count), y = count)) +
      
      # plot as bar graph using geom_col(), which plots the y values we supply (count); geom_bar() would count the rows instead
      geom_col() +
      
      # switch the x and y axis
      coord_flip() +
      
      # add the number of detections at the end of each bar
      geom_text(aes(label = count),
                color = "black",
                size = 3,
                hjust = -0.3,
                vjust = 0.2) +
      
      # label x and y axis with informative titles
      labs(x = 'Species',
           y = 'Number of Independent (30 min) Detections') +
      
      # add title to plot with LU name the .y will take the name of whatever you named each list element in the detection_data list, so make sure this name is what you want on the ggtitle
      ggtitle(.y) +
      
      # set the theme
      theme_classic() +
      theme(plot.title = element_text(hjust = 0.5)))

# view plots, this will print each in its own window so you have to scroll back in the plot viewer pane to look at each one
detection_plots
## $`Detections LU01`

## 
## $`Detections LU13`

## 
## $`Detections LU15`

## 
## $`Detections LU21`

Save detection plots

Now we want to save these plots in case we need each individual one (we will combine the detection and naive occ plots into a single figure for each LU later and use those for the OSM report, but we may want these standalone plots later so let’s save them while they are here).

We can save all the plots from the purrr iteration above using purrr::imap(). imap() is used instead of map() because it gives us access to the list element names (plot names) through .y, which we paste into the file name.

IMPORTANT: if you are using this code for a future GitHub repo, DO NOT use .tiff as the file extension. This will cause issues when trying to push any changes to the GitHub repo, as the files are too large to meet GitHub’s file size limits.

# save plots (only run if needed)
purrr::imap(
  detection_plots,
  ~ggsave(filename = paste0("figures/",
                            .y,
                            '.jpg'), # avoid using .tiff extension in the github repo, those files are too large to push to origin
          plot = .x,
          dpi = 600,
          width = 11,
          height = 9,
          units = 'in'))
## $`Detections LU01`
## [1] "figures/Detections LU01.jpg"
## 
## $`Detections LU13`
## [1] "figures/Detections LU13.jpg"
## 
## $`Detections LU15`
## [1] "figures/Detections LU15.jpg"
## 
## $`Detections LU21`
## [1] "figures/Detections LU21.jpg"

Naive occupancy

Data

We also need to alter the detection data a bit to use for naive occupancy plots.

We will use the individual LU detection data like we did before and use purrr::map() to apply the same data formatting to all 4 data frames.

Here we want to calculate the total number of sites in each LU and the number of sites each species was detected at in each LU, and then use both of those numbers to calculate naive occupancy for each species in each LU.

# First we need to alter the data frame a bit for these plots, let's create a data frame for each LU (I couldn't figure out how to do this without assigning individual data frames for each UGH)


# apply the same formatting to each data frame using purrr
occupancy_data <- list(dets_LU1,
                       dets_LU13,
                       dets_LU15,
                       dets_LU21) %>% 
  
  purrr::map(
    ~.x %>% 
      
      # calculate the total number of sites for each LU
      mutate(total_sites = n_distinct(site)) %>% 
      
      # group by species to calculate the number of sites each spp occurred at
      group_by(species) %>% 
      
      # add columns for the number of sites each spp occurred at, the naive occupancy, and the number of independent detections
      reframe(count = n_distinct(site),
              naive_occ = count/total_sites,
              ind_det = n_distinct(event_id)) %>% 
      
      # keep just the columns we need
      select(species, naive_occ, ind_det) %>% 
      
      # keep only unique (distinct) rows so we should be left with one row per species, this helps with plotting
      distinct()) %>% 
  
  purrr::set_names('Naive Occupancy LU01',
                   'Naive Occupancy LU13',
                   'Naive Occupancy LU15',
                   'Naive Occupancy LU21')

Occupancy plots

Now we can graph naive occupancy for each LU using purrr. As with the detection plots, iterating over the list of data frames with purrr produces all four plots at once and saves us from copying and pasting the code for each one individually.

# create object occupancy_plots which uses the occupancy_data list (w/ all 4 LUs)
occupancy_plots <- occupancy_data %>% 
  
  # use imap instead of map as it allows us to use .y to paste the list element names as the plot titles later
  purrr::imap(
    ~.x %>% 

      # now just copy and paste the ggplot code for the occupancy graphs
      ggplot(.,
             aes(x = fct_reorder(species,
                                 ind_det), # this reorders the species so they match the order of the detection plot which makes it better for viewing when the plots are arranged together in 1 figure for each LU
                 y = naive_occ)) +
      
      # plot as bars using geom_col() which uses stat = 'identity', instead of geom_bar() which will count the rows in each group and plot that instead of naive occ
      geom_col() +
      
      # flip x and y axis 
      coord_flip() +
      
      # add text to end of bars that provides naive occ value
      geom_text(aes(label = round(naive_occ, 2)), 
                size = 3, 
                hjust = -0.3, 
                vjust = 0.2) +
      
      # relabel x and y axis and title
      labs(x = 'Species',
           y = 'Proportion of Sites With At Least One Detection') +
      
      # set plot title using .y (name of list object)
      ggtitle(.y) +
      
      # set theme elements
      theme_classic() +
      theme(plot.title = element_text(hjust = 0.5)))

# view plots
occupancy_plots
## $`Naive Occupancy LU01`

## 
## $`Naive Occupancy LU13`

## 
## $`Naive Occupancy LU15`

## 
## $`Naive Occupancy LU21`

Save occupancy plots

As with the detection plots, we might want these individual plots later, so we can use purrr::imap() to save them to the figures folder.

Again, avoid using the .tiff extension in the GitHub repo.

# save plots 
purrr::imap(
  occupancy_plots,
  ~ggsave(filename = paste0("figures/",
                            .y,
                            '.jpg'), # avoid using .tiff extension in the github repo, those files are too large to push to origin
          plot = .x,
          dpi = 600,
          width = 11,
          height = 9,
          units = 'in'))
## $`Naive Occupancy LU01`
## [1] "figures/Naive Occupancy LU01.jpg"
## 
## $`Naive Occupancy LU13`
## [1] "figures/Naive Occupancy LU13.jpg"
## 
## $`Naive Occupancy LU15`
## [1] "figures/Naive Occupancy LU15.jpg"
## 
## $`Naive Occupancy LU21`
## [1] "figures/Naive Occupancy LU21.jpg"

Final combined plots for report

The previous year’s report had a figure for each LU with the detections plot on the top and the occupancy plot on the bottom so we will recreate these for this year using ggarrange().

Unfortunately I could not figure out how to do this in purrr to reduce coding, but luckily it isn’t too much repetition.

# not sure I know how to do the following section in purrr just yet, but we've saved a ton of coding so far and it doesn't take much to arrange each of these individually

# LU1

# arrange the plots so each LU has a figure with detections on top and naive occ on bottom
LU1_det_occ_plots <- ggarrange(detection_plots$`Detections LU01`, occupancy_plots$`Naive Occupancy LU01`,
                               labels = c("A", "B"),
                               nrow = 2)

# view plot
LU1_det_occ_plots

# LU13

# arrange the plots so each LU has a figure with detections on top and naive occ on bottom
LU13_det_occ_plots <- ggarrange(detection_plots$`Detections LU13`, occupancy_plots$`Naive Occupancy LU13`,
                               labels = c("A", "B"),
                               nrow = 2)

# view plot
LU13_det_occ_plots

# LU15

# arrange the plots so each LU has a figure with detections on top and naive occ on bottom
LU15_det_occ_plots <- ggarrange(detection_plots$`Detections LU15`, occupancy_plots$`Naive Occupancy LU15`,
                                labels = c("A", "B"),
                                nrow = 2)

# view plot
LU15_det_occ_plots

# LU21

# arrange the plots so each LU has a figure with detections on top and naive occ on bottom
LU21_det_occ_plots <- ggarrange(detection_plots$`Detections LU21`, occupancy_plots$`Naive Occupancy LU21`,
                                labels = c("A", "B"),
                                nrow = 2)

# view plot
LU21_det_occ_plots
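
For reference, the four ggarrange() calls above could probably be collapsed into a single purrr::map2() call, which walks two lists in parallel. The sketch below is an untested suggestion for this script and uses made-up plots built from the built-in mtcars data (toy_det_plots and toy_occ_plots are hypothetical stand-ins for detection_plots and occupancy_plots). It assumes both lists hold the LUs in the same order.

```r
library(ggplot2)
library(ggpubr)
library(purrr)

# toy stand-ins for detection_plots and occupancy_plots
toy_det_plots <- list(LU01 = ggplot(mtcars, aes(wt, mpg)) + geom_point(),
                      LU13 = ggplot(mtcars, aes(hp, mpg)) + geom_point())

toy_occ_plots <- list(LU01 = ggplot(mtcars, aes(wt, qsec)) + geom_point(),
                      LU13 = ggplot(mtcars, aes(hp, qsec)) + geom_point())

# .x is a plot from the first list, .y is the matching plot from the second
toy_combined <- map2(toy_det_plots,
                     toy_occ_plots,
                     ~ ggarrange(.x, .y,
                                 labels = c("A", "B"),
                                 nrow = 2))

# the result keeps the names of the first list
names(toy_combined)
```

With the real lists the result would take its names from detection_plots (e.g. Detections LU01), so purrr::set_names() would still be needed before saving if different file names are wanted.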

We can, however, save all the figures at once using purrr.

# save all figures at once using purrr
final_det_occ_plots <- list(LU1_det_occ_plots,
                            LU13_det_occ_plots,
                            LU15_det_occ_plots,
                            LU21_det_occ_plots) %>% 
  

  purrr::set_names('LU01_det_occ_plots',
                   'LU13_det_occ_plots',
                   'LU15_det_occ_plots',
                   'LU21_det_occ_plots') %>% 
  
  purrr::imap(
    ~ggsave(filename = paste0("figures/",
                              .y,
                              '.jpg'), # avoid using .tiff extension in the github repo, those files are too large to push to origin
            plot = .x,
            dpi = 600,
            width = 12,
            height = 15,
            units = 'in'))

Analysis

Read in data

We need the proportional binomial data and the covariate data (from the ACME_camera_script_9-2-2024.R or .Rmd). Let’s read those in now and check the structure of each.

# response metric (proportional detections from the ACME_camera_script_9-2-2024.R or .Rmd)
prop_detections <- read_csv('data/processed/OSM_2022_proportional_detections.csv')
## Rows: 152 Columns: 23
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): site
## dbl (22): black_bear, coyote, fisher, moose, white-tailed_deer, cougar, grey...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# check variable structure
str(prop_detections)
## spc_tbl_ [152 × 23] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ site                    : chr [1:152] "LU01_06" "LU01_10" "LU01_11" "LU01_13" ...
##  $ black_bear              : num [1:152] 7 3 4 7 8 9 4 5 7 7 ...
##  $ coyote                  : num [1:152] 4 4 8 10 11 9 11 0 9 4 ...
##  $ fisher                  : num [1:152] 5 3 3 3 2 1 1 1 0 3 ...
##  $ moose                   : num [1:152] 3 2 5 9 1 0 2 4 1 0 ...
##  $ white-tailed_deer       : num [1:152] 12 5 12 12 13 14 15 9 12 10 ...
##  $ cougar                  : num [1:152] 0 0 1 0 1 0 0 0 0 0 ...
##  $ grey_wolf               : num [1:152] 0 0 2 0 0 0 1 0 0 0 ...
##  $ lynx                    : num [1:152] 0 0 1 0 1 1 0 0 0 2 ...
##  $ red_fox                 : num [1:152] 0 0 2 0 0 0 0 0 4 0 ...
##  $ wolverine               : num [1:152] 0 0 0 0 0 0 0 0 0 0 ...
##  $ caribou                 : num [1:152] 0 0 0 0 0 0 0 0 0 0 ...
##  $ absent_black_bear       : num [1:152] 5 3 8 5 4 3 8 7 5 5 ...
##  $ absent_coyote           : num [1:152] 10 1 6 5 3 5 4 15 6 11 ...
##  $ absent_fisher           : num [1:152] 9 2 11 12 12 13 14 14 15 12 ...
##  $ absent_moose            : num [1:152] 11 3 9 6 13 14 13 11 14 15 ...
##  $ absent_white-tailed_deer: num [1:152] 2 0 2 3 1 0 0 6 3 5 ...
##  $ absent_cougar           : num [1:152] 14 5 13 15 13 14 15 15 15 15 ...
##  $ absent_grey_wolf        : num [1:152] 14 5 12 15 14 14 14 15 15 15 ...
##  $ absent_lynx             : num [1:152] 14 5 13 15 13 13 15 15 15 13 ...
##  $ absent_red_fox          : num [1:152] 14 5 12 15 14 14 15 15 11 15 ...
##  $ absent_wolverine        : num [1:152] 14 5 14 15 14 14 15 15 15 15 ...
##  $ absent_caribou          : num [1:152] 14 5 14 15 14 14 15 15 15 15 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   site = col_character(),
##   ..   black_bear = col_double(),
##   ..   coyote = col_double(),
##   ..   fisher = col_double(),
##   ..   moose = col_double(),
##   ..   `white-tailed_deer` = col_double(),
##   ..   cougar = col_double(),
##   ..   grey_wolf = col_double(),
##   ..   lynx = col_double(),
##   ..   red_fox = col_double(),
##   ..   wolverine = col_double(),
##   ..   caribou = col_double(),
##   ..   absent_black_bear = col_double(),
##   ..   absent_coyote = col_double(),
##   ..   absent_fisher = col_double(),
##   ..   absent_moose = col_double(),
##   ..   `absent_white-tailed_deer` = col_double(),
##   ..   absent_cougar = col_double(),
##   ..   absent_grey_wolf = col_double(),
##   ..   absent_lynx = col_double(),
##   ..   absent_red_fox = col_double(),
##   ..   absent_wolverine = col_double(),
##   ..   absent_caribou = col_double()
##   .. )
##  - attr(*, "problems")=<externalptr>
# model covariates (merged HFI and VEG data from the ACME_camera_script_9-2-2024.R or .Rmd)
covariates <- read_csv('data/processed/OSM_2022_covariates.csv',
                       
                       # set the column types to read in correctly
                       col_types = cols(array = col_factor(),
                                        camera = col_factor(),
                                        site = col_factor(),
                                        buff_dist = col_factor(),
                                        .default = col_number()))

# check variable structure
str(covariates)
## spc_tbl_ [3,100 × 76] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ array                       : Factor w/ 4 levels "LU13","LU15",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ camera                      : Factor w/ 96 levels "18","15","03",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ site                        : Factor w/ 155 levels "LU13_18","LU13_15",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ buff_dist                   : Factor w/ 20 levels "250","500","750",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ harvest_area                : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ crop                        : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ well_aband                  : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ well_oil                    : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ trail                       : num [1:3100] 0 0 NA 0.5 0 ...
##  $ harvest_area_white_zone     : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ conventional_seismic        : num [1:3100] 0.5 0.5 NA 0.5 1 ...
##  $ pipeline                    : num [1:3100] 0 0.5 NA 0 0 0 0.5 0 0 0 ...
##  $ tame_pasture                : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ rough_pasture               : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ rural_residence             : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ transmission_line           : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ well_gas                    : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ misc_oil_gas_facility       : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ clearing_unknown            : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ vegetated_edge_roads        : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ road_unimproved             : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ road_gravel_1l              : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ road_gravel_2l              : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ truck_trail                 : num [1:3100] 0.5 0 NA 0 0 0 0 0 0 0 ...
##  $ borrowpits                  : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ sump                        : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ borrowpit_wet               : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ cultivation_abandoned       : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ urban_residence             : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ country_residence           : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ recreation                  : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ well_other                  : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ well_bitumen                : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ well_cased                  : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ road_paved_undiv_2l         : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ road_unclassified           : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ runway                      : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ clearing_wellpad_unconfirmed: num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ facility_unknown            : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ borrowpit_dry               : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ grvl_sand_pit               : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ dugout                      : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ lagoon                      : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ open_pit_mine               : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ low_impact_seismic          : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ surrounding_veg             : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ transfer_station            : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ facility_other              : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ vegetated_edge_railways     : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ fruit_vegetables            : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ residence_clearing          : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ cfo                         : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ landfill                    : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ well_cleared_not_confirmed  : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ oil_gas_plant               : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ urban_industrial            : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ road_paved_1l               : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ road_paved_undiv_1l         : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ road_winter                 : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ well_cleared_not_drilled    : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ well_unknown                : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ airp_runway                 : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ reservoir                   : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ campground                  : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ canal                       : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ camp_industrial             : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ rlwy_sgl_track              : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ lc_class20                  : num [1:3100] 0 0 0.4 0.2 0.5 0 0 0 0 0 ...
##  $ lc_class33                  : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ lc_class34                  : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ lc_class50                  : num [1:3100] 0 0 0.2 0.2 0 0 0 0 0 0 ...
##  $ lc_class110                 : num [1:3100] 0 0.167 0 0 0 ...
##  $ lc_class120                 : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ lc_class210                 : num [1:3100] 0.5 0.333 0.2 0.4 0.5 ...
##  $ lc_class220                 : num [1:3100] 0.5 0.333 0.2 0.2 0 ...
##  $ lc_class230                 : num [1:3100] 0 0.167 0 0 0 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   .default = col_number(),
##   ..   array = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   ..   camera = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   ..   site = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   ..   buff_dist = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   ..   harvest_area = col_number(),
##   ..   crop = col_number(),
##   ..   well_aband = col_number(),
##   ..   well_oil = col_number(),
##   ..   trail = col_number(),
##   ..   harvest_area_white_zone = col_number(),
##   ..   conventional_seismic = col_number(),
##   ..   pipeline = col_number(),
##   ..   tame_pasture = col_number(),
##   ..   rough_pasture = col_number(),
##   ..   rural_residence = col_number(),
##   ..   transmission_line = col_number(),
##   ..   well_gas = col_number(),
##   ..   misc_oil_gas_facility = col_number(),
##   ..   clearing_unknown = col_number(),
##   ..   vegetated_edge_roads = col_number(),
##   ..   road_unimproved = col_number(),
##   ..   road_gravel_1l = col_number(),
##   ..   road_gravel_2l = col_number(),
##   ..   truck_trail = col_number(),
##   ..   borrowpits = col_number(),
##   ..   sump = col_number(),
##   ..   borrowpit_wet = col_number(),
##   ..   cultivation_abandoned = col_number(),
##   ..   urban_residence = col_number(),
##   ..   country_residence = col_number(),
##   ..   recreation = col_number(),
##   ..   well_other = col_number(),
##   ..   well_bitumen = col_number(),
##   ..   well_cased = col_number(),
##   ..   road_paved_undiv_2l = col_number(),
##   ..   road_unclassified = col_number(),
##   ..   runway = col_number(),
##   ..   clearing_wellpad_unconfirmed = col_number(),
##   ..   facility_unknown = col_number(),
##   ..   borrowpit_dry = col_number(),
##   ..   grvl_sand_pit = col_number(),
##   ..   dugout = col_number(),
##   ..   lagoon = col_number(),
##   ..   open_pit_mine = col_number(),
##   ..   low_impact_seismic = col_number(),
##   ..   surrounding_veg = col_number(),
##   ..   transfer_station = col_number(),
##   ..   facility_other = col_number(),
##   ..   vegetated_edge_railways = col_number(),
##   ..   fruit_vegetables = col_number(),
##   ..   residence_clearing = col_number(),
##   ..   cfo = col_number(),
##   ..   landfill = col_number(),
##   ..   well_cleared_not_confirmed = col_number(),
##   ..   oil_gas_plant = col_number(),
##   ..   urban_industrial = col_number(),
##   ..   road_paved_1l = col_number(),
##   ..   road_paved_undiv_1l = col_number(),
##   ..   road_winter = col_number(),
##   ..   well_cleared_not_drilled = col_number(),
##   ..   well_unknown = col_number(),
##   ..   airp_runway = col_number(),
##   ..   reservoir = col_number(),
##   ..   campground = col_number(),
##   ..   canal = col_number(),
##   ..   camp_industrial = col_number(),
##   ..   rlwy_sgl_track = col_number(),
##   ..   lc_class20 = col_number(),
##   ..   lc_class33 = col_number(),
##   ..   lc_class34 = col_number(),
##   ..   lc_class50 = col_number(),
##   ..   lc_class110 = col_number(),
##   ..   lc_class120 = col_number(),
##   ..   lc_class210 = col_number(),
##   ..   lc_class220 = col_number(),
##   ..   lc_class230 = col_number()
##   .. )
##  - attr(*, "problems")=<externalptr>

Format covariates

There are too many covariates to include in the models individually, and many of them describe similar HFI features, so we group related features into broader categories. Detailed descriptions of each feature are in the README file in this repository, which draws on the ABMI human footprint wall-to-wall data download website for Year 2021, or in the relevant_literature folder of this repository (HFI_2021_v1_0_Metadata_Final.pdf).

The current version of this code, written for the 2022-2023 report, uses a merged dataset from the 2021-2022 and 2022-2023 data. However, the variables were extracted slightly differently from GIS in each year, so the final version of this code will include a different formatting process, which will likely occur in the ACME_camera_script_9-2-2024.R or .Rmd.
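As a rough illustration of the grouping pattern used in the code that follows, here is a minimal sketch on a toy tibble. The values and the object names `toy` and `toy_grouped` are made up for illustration; only the column names are borrowed from the covariates. It shows that `contains()` matches any column whose name includes the string (so `'well'` also picks up `clearing_wellpad_unconfirmed`) and that `.keep = 'unused'` drops the columns that were summed.

```r
library(dplyr)

# toy data with made-up values (not the real covariates)
toy <- tibble(
  site                         = c('A', 'B'),
  well_oil                     = c(0.10, 0.00),
  well_gas                     = c(0.20, 0.30),
  clearing_wellpad_unconfirmed = c(0.05, 0.00),
  pipeline                     = c(0.40, 0.50)
)

toy_grouped <- toy %>%
  # sum every column whose name contains 'well' into one grouped feature
  mutate(wellsites = rowSums(across(contains('well'))),
         # drop the columns that were used to build the new one
         .keep = 'unused')

# the three 'well' columns are gone; site, pipeline and wellsites remain
names(toy_grouped)
```

Checking `names()` after each grouping step is a cheap way to confirm that a pattern did not absorb a column you meant to keep separate.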

covariates_grouped <- covariates %>% 
  
  mutate(borrowpits = rowSums(across(contains('borrowpit'))),
         industrial_sites = camp_industrial + oil_gas_plant + open_pit_mine + 
           rowSums(across(contains('facility'))),
         seismic_lines = rowSums(across(contains('seismic'))),
         wellsites = rowSums(across(contains('well'))),
         roads =  rowSums(across(contains('road'))),
         havest_areas = rowSums(across(contains('harvest'))),
         trails = rowSums(across(contains('trail'))),
         residences = rowSums(across(contains('residence'))),
         pasture = rowSums(across(contains('pasture'))),
         other_transportation_features = runway + airp_runway + rlwy_sgl_track + vegetated_edge_railways,
         crops = crop + fruit_vegetables + cultivation_abandoned,
         water = lagoon + reservoir + dugout + canal,
         .keep = 'unused') %>% 
  
  # remove features we don't need
  select(!c(recreation,
            clearing_unknown,
            cfo,
            grvl_sand_pit,
            transfer_station,
            campground,
            surrounding_veg,
            urban_industrial,
            landfill,
            sump,
            water,
            crops,
            other_transportation_features,
            pasture,
            residences
            )) %>% 
  
  # reorder variables
  relocate(c(pipeline,
             transmission_line,
             borrowpits),
           .after = lc_class230)

# see what's left
names(covariates_grouped)
##  [1] "array"             "camera"            "site"             
##  [4] "buff_dist"         "lc_class20"        "lc_class33"       
##  [7] "lc_class34"        "lc_class50"        "lc_class110"      
## [10] "lc_class120"       "lc_class210"       "lc_class220"      
## [13] "lc_class230"       "pipeline"          "transmission_line"
## [16] "borrowpits"        "industrial_sites"  "seismic_lines"    
## [19] "wellsites"         "roads"             "havest_areas"     
## [22] "trails"
# check the structure of new data
str(covariates_grouped)
## tibble [3,100 × 22] (S3: tbl_df/tbl/data.frame)
##  $ array            : Factor w/ 4 levels "LU13","LU15",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ camera           : Factor w/ 96 levels "18","15","03",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ site             : Factor w/ 155 levels "LU13_18","LU13_15",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ buff_dist        : Factor w/ 20 levels "250","500","750",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ lc_class20       : num [1:3100] 0 0 0.4 0.2 0.5 0 0 0 0 0 ...
##  $ lc_class33       : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ lc_class34       : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ lc_class50       : num [1:3100] 0 0 0.2 0.2 0 0 0 0 0 0 ...
##  $ lc_class110      : num [1:3100] 0 0.167 0 0 0 ...
##  $ lc_class120      : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ lc_class210      : num [1:3100] 0.5 0.333 0.2 0.4 0.5 ...
##  $ lc_class220      : num [1:3100] 0.5 0.333 0.2 0.2 0 ...
##  $ lc_class230      : num [1:3100] 0 0.167 0 0 0 ...
##  $ pipeline         : num [1:3100] 0 0.5 NA 0 0 0 0.5 0 0 0 ...
##  $ transmission_line: num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ borrowpits       : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ industrial_sites : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ seismic_lines    : num [1:3100] 0.5 0.5 NA 0.5 1 ...
##  $ wellsites        : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ roads            : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ havest_areas     : num [1:3100] 0 0 NA 0 0 0 0 0 0 0 ...
##  $ trails           : num [1:3100] 0.5 0 NA 0.5 0 ...
# check summary of new data
summary(covariates_grouped)
##   array         camera          site        buff_dist      lc_class20     
##  LU13:820   27     :  80   LU13_18:  20   250    : 155   Min.   :0.00000  
##  LU15:780   32     :  80   LU13_15:  20   500    : 155   1st Qu.:0.00000  
##  LU21:720   41     :  80   LU13_03:  20   750    : 155   Median :0.05460  
##  LU01:780   36     :  80   LU13_34:  20   1000   : 155   Mean   :0.07304  
##             16     :  60   LU13_57:  20   1250   : 155   3rd Qu.:0.11321  
##             21     :  60   LU13_16:  20   1500   : 155   Max.   :0.59091  
##             (Other):2660   (Other):2980   (Other):2170                    
##    lc_class33          lc_class34        lc_class50      lc_class110     
##  Min.   :0.0000000   Min.   :0.00000   Min.   :0.0000   Min.   :0.00000  
##  1st Qu.:0.0000000   1st Qu.:0.00000   1st Qu.:0.1000   1st Qu.:0.05882  
##  Median :0.0000000   Median :0.00000   Median :0.1686   Median :0.12500  
##  Mean   :0.0005196   Mean   :0.01445   Mean   :0.1829   Mean   :0.12674  
##  3rd Qu.:0.0000000   3rd Qu.:0.01316   3rd Qu.:0.2500   3rd Qu.:0.17550  
##  Max.   :0.0909091   Max.   :0.33333   Max.   :1.0000   Max.   :0.55556  
##                                                                          
##   lc_class120       lc_class210      lc_class220      lc_class230     
##  Min.   :0.00000   Min.   :0.0000   Min.   :0.0000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.1064   1st Qu.:0.1726   1st Qu.:0.08217  
##  Median :0.00000   Median :0.1591   Median :0.2222   Median :0.15152  
##  Mean   :0.03835   Mean   :0.1813   Mean   :0.2268   Mean   :0.15583  
##  3rd Qu.:0.00000   3rd Qu.:0.2179   3rd Qu.:0.2735   3rd Qu.:0.22093  
##  Max.   :1.00000   Max.   :1.0000   Max.   :1.0000   Max.   :0.66667  
##                                                                       
##     pipeline       transmission_line    borrowpits      industrial_sites  
##  Min.   :0.00000   Min.   :0.000000   Min.   :0.00000   Min.   :0.000000  
##  1st Qu.:0.00000   1st Qu.:0.000000   1st Qu.:0.00000   1st Qu.:0.000000  
##  Median :0.03637   Median :0.000000   Median :0.00000   Median :0.000000  
##  Mean   :0.06940   Mean   :0.004712   Mean   :0.01108   Mean   :0.001448  
##  3rd Qu.:0.10638   3rd Qu.:0.000000   3rd Qu.:0.01807   3rd Qu.:0.000000  
##  Max.   :1.00000   Max.   :0.500000   Max.   :0.16667   Max.   :0.111111  
##  NA's   :8         NA's   :8          NA's   :8         NA's   :8         
##  seismic_lines      wellsites           roads          havest_areas    
##  Min.   :0.0000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.2774   1st Qu.:0.01541   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.3868   Median :0.04408   Median :0.05939   Median :0.00000  
##  Mean   :0.4173   Mean   :0.05748   Mean   :0.15189   Mean   :0.04801  
##  3rd Qu.:0.5000   3rd Qu.:0.08125   3rd Qu.:0.27978   3rd Qu.:0.04506  
##  Max.   :1.0000   Max.   :0.50000   Max.   :0.83333   Max.   :0.83333  
##  NA's   :8        NA's   :8         NA's   :8         NA's   :8        
##      trails      
##  Min.   :0.0000  
##  1st Qu.:0.0617  
##  Median :0.1522  
##  Mean   :0.1874  
##  3rd Qu.:0.2712  
##  Max.   :1.0000  
##  NA's   :8
# there are some NAs in the data which will cause problems with modeling/visualization of the data; remove those rows for now, but explore these sites specifically after the report

covariates_grouped <- covariates_grouped %>% 
  
  # remove rows with NAs
  na.omit()
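Before dropping rows it can help to see which buffers the missing values come from, since removing them leaves those buffers with fewer sites. The sketch below is a minimal example on made-up data; `toy` and `toy_na_counts` are hypothetical names, and the same pattern could be applied to the real data before the `na.omit()` call.

```r
library(dplyr)

# toy data with made-up values; one row in the 250 buffer has missing covariates
toy <- tibble(
  buff_dist = factor(c('250', '250', '500', '500')),
  pipeline  = c(NA, 0.5, 0.1, 0.0),
  roads     = c(NA, 0.0, 0.2, 0.3)
)

toy_na_counts <- toy %>%
  # flag rows with at least one missing value
  mutate(has_na = !complete.cases(.)) %>%
  # count flagged rows within each buffer
  group_by(buff_dist) %>%
  summarise(rows_with_na = sum(has_na))

toy_na_counts
```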

Subset data & correlation plots

Marissa try to get the purrr code for this to work later

Now we need to subset the data for each buffer width and, in the same loop, make correlation plots for the variables within each buffer.

# Couldn't get this to work in purrr yet so using a loop to subset the data, create the plots, and save them all in one section... NEAT

buffer_frames <- list()

for (i in unique(covariates_grouped$buff_dist)) {
  
  print(i)
  
  # subset data based on buffer radius
  df <- covariates_grouped %>%
    filter(buff_dist == i)
  
  # rename data frame on the fly (creates df_250, df_500, etc.)
  assign(paste("df", i, sep = "_"), df)
  
  # append to the list of data frames
  buffer_frames <- c(buffer_frames, list(df))
  
  # keep only the numeric variables for the correlation matrix
  df <- df %>%
    select(where(is.numeric))
  
  # compute a correlation matrix (watch for errors)
  # a variable that is constant within a buffer triggers the 'standard deviation is zero' warning and gives NA correlations
  cor_matrix <- cor(df)
  
  # save the correlation plot on the go, naming the file for each buffer as we do
  # paste0() avoids the spaces that paste() would put in the file name
  png(file.path("figures", paste0("correlation_", i, ".png")))
  corrplot::corrplot(cor_matrix,
                     type = 'upper',
                     tl.col = 'black',
                     title = paste0('Variable correlation plot at ', i))
  dev.off()
  
}
## [1] "250"
## Warning in cor(df): the standard deviation is zero
## [1] "500"
## Warning in cor(df): the standard deviation is zero
## [1] "750"
## [1] "1000"
## [1] "1250"
## [1] "1500"
## [1] "1750"
## [1] "2000"
## [1] "2250"
## [1] "2500"
## [1] "2750"
## [1] "3000"
## [1] "3250"
## [1] "3500"
## [1] "3750"
## [1] "4000"
## [1] "4250"
## [1] "4500"
## [1] "4750"
## [1] "5000"
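The 'standard deviation is zero' warnings for the 250 and 500 buffers mean that at least one numeric variable is constant within those buffers, so its correlations come back as NA. The sketch below shows one way to find such columns. It uses made-up data, and `toy_250`, `constant_feature` and `zero_sd_cols` are hypothetical names, not columns known to be the cause here.

```r
# toy data with made-up values; one column never varies
toy_250 <- data.frame(
  pipeline         = c(0.0, 0.5, 0.2),
  constant_feature = c(0.0, 0.0, 0.0),   # hypothetical column with zero variance
  roads            = c(0.1, 0.0, 0.3)
)

# standard deviation of every column
sds <- sapply(toy_250, sd)

# names of the columns that would trigger the warning in cor()
zero_sd_cols <- names(sds)[sds == 0]

zero_sd_cols
```

Running the same check on `df_250` and `df_500` (restricted to numeric columns) would show which variables to drop or treat with caution at the smallest buffers.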
# name list objects so we can extract names for plotting 

buffer_frames <- buffer_frames %>% 
  
  # a long-winded way to do this, but it works for the sake of time
  purrr::set_names('250 meter buffer',
                   '500 meter buffer',
                   '750 meter buffer',
                   '1000 meter buffer',
                   '1250 meter buffer',
                   '1500 meter buffer',
                   '1750 meter buffer',
                   '2000 meter buffer',
                   '2250 meter buffer',
                   '2500 meter buffer',
                   '2750 meter buffer',
                   '3000 meter buffer',
                   '3250 meter buffer',
                   '3500 meter buffer',
                   '3750 meter buffer',
                   '4000 meter buffer',
                   '4250 meter buffer',
                   '4500 meter buffer',
                   '4750 meter buffer',
                   '5000 meter buffer')
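For a later revision, the subsetting and naming could probably be done without the loop or the hand-typed names. The following is only a sketch on made-up data, assuming that `split()` on the buffer factor is an acceptable replacement for the `filter()` calls; `toy`, `toy_frames` and `toy_cor` are hypothetical names.

```r
library(dplyr)
library(purrr)

# toy data with made-up values for two buffers
toy <- tibble(
  buff_dist = factor(c('250', '250', '500', '500'), levels = c('250', '500')),
  pipeline  = c(0.1, 0.4, 0.2, 0.3),
  roads     = c(0.0, 0.2, 0.1, 0.5)
)

toy_frames <- toy %>%
  # one data frame per buffer, named by factor level
  split(.$buff_dist) %>%
  # build the plotting names from the existing list names
  set_names(paste0(names(.), ' meter buffer'))

# a correlation matrix per buffer, numeric columns only
toy_cor <- toy_frames %>%
  map(~ .x %>%
        select(where(is.numeric)) %>%
        cor())

names(toy_frames)
```

Note that `split()` orders the list by factor level, so the names stay attached to the right buffer even if the row order changes.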

Exploratory plots

Add more to this section later, when we have more time to explore the covariates and choose which should be included, etc.

hfi_histograms <- buffer_frames %>% 
  
  purrr::imap(
    ~.x %>% 
      
      # filter to just the HFI variables 
      select(where(is.numeric) &
          ! starts_with('lc_class')) %>% 
      
      # pipe into hist.data.frame function to make histograms for each variable
      hist.data.frame(mtitl = paste0('Histograms of HFI variables at ', .y)))

Now let’s do the same thing with the landcover variables.

lc_histograms <- buffer_frames %>% 
  
  purrr::imap(
    ~.x %>% 
      
      # filter to just the landcover variables 
      select(where(is.numeric) &
          starts_with('lc_class')) %>% 
      
      # pipe into hist.data.frame function to make histograms for each variable
      hist.data.frame(mtitl = paste0('Histograms of landcover variables at ', .y)))

Add response metric

Now that we have the covariate data formatted, we need to add the response metric (monthly proportional presence/absence) to the data frames.

final_df <- buffer_frames %>% 
  
  purrr::map(
    ~.x %>% 
      
      left_join(prop_detections,
                by = 'site'))
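Because `left_join()` keeps every covariate row, a site with no match in the detection data silently gets NA for the response. A quick check with `anti_join()` can flag those sites. The sketch below uses made-up data; `toy_covs`, `toy_detections`, `toy_joined` and `unmatched` are hypothetical names standing in for one buffer data frame and `prop_detections`.

```r
library(dplyr)

# toy covariates for three sites (made-up values)
toy_covs <- tibble(
  site  = c('LU13_18', 'LU13_15', 'LU13_03'),
  roads = c(0.0, 0.2, 0.1)
)

# toy detection data that is missing one site (made-up values)
toy_detections <- tibble(
  site              = c('LU13_18', 'LU13_15'),
  black_bear        = c(3, 0),
  absent_black_bear = c(9, 12)
)

# same join as above: all covariate rows are kept
toy_joined <- left_join(toy_covs, toy_detections, by = 'site')

# covariate rows with no matching detection record
unmatched <- anti_join(toy_covs, toy_detections, by = 'site')

unmatched$site
```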

Black bear

ANDREW/MARISSA FIX later

There is probably a way to shorten the following code to select particular species. I saw Andrew’s for loop in the draft script he wrote but couldn’t quite figure it out, so I did this instead; maybe we can merge approaches?

Global model

black_bear_mods <- final_df %>%

  purrr::map(
    ~.x %>%

      # global model: grouped HFI and landcover covariates with a random intercept for array
      glmmTMB::glmmTMB(cbind(black_bear, absent_black_bear) ~
                         seismic_lines +
                         pipeline +
                         borrowpits +
                         wellsites +
                         roads +
                         trails +
                         lc_class20 +
                         lc_class34 +
                         lc_class50 +
                         lc_class110 +
                         lc_class210 +
                         lc_class220 +
                         lc_class230 +
                         (1|array),
                       data = .,
                       family = 'binomial'))
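The two-column response `cbind(black_bear, absent_black_bear)` is the standard way to give a binomial model counts of successes and failures. The sketch below illustrates that structure with base `glm()` rather than `glmmTMB` (so there is no random effect) on made-up data, assuming the two columns hold the number of months with and without detections. All object and column names in it are hypothetical.

```r
# toy data with made-up counts of months with and without detections
toy <- data.frame(
  present = c(2, 5, 1, 8, 4, 7),
  absent  = c(10, 7, 11, 4, 8, 5),
  roads   = c(0.0, 0.3, 0.1, 0.8, 0.2, 0.6)
)

# successes/failures form, as used in the global model above
mod_counts <- glm(cbind(present, absent) ~ roads,
                  data = toy,
                  family = 'binomial')

# equivalent form: proportion of months present, weighted by months surveyed
toy$months <- toy$present + toy$absent
toy$prop   <- toy$present / toy$months

mod_prop <- glm(prop ~ roads,
                data = toy,
                weights = months,
                family = 'binomial')

# both forms give the same coefficient estimates
coef(mod_counts)
coef(mod_prop)
```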

Model selection

model.sel(black_bear_mods)
## Warning in model.sel.default(black_bear_mods): models are not all fitted to the
## same data
## Model selection table 
##                   cnd((Int)) dsp((Int)) cnd(brr) cnd(lc_c11) cnd(lc_c20)
## 250 meter buffer     -0.8237          +   1.0280     0.40780   0.5002000
## 500 meter buffer     -1.8100          +  -8.8800     1.48300   2.1650000
## 1000 meter buffer    -1.2440          +   5.0040    -0.79470   0.9692000
## 750 meter buffer     -0.9882          +   3.2100     0.08085   1.0990000
## 1500 meter buffer    -1.2430          +  -1.6840    -1.97700  -0.1470000
## 4000 meter buffer    -2.5810          + -10.0700     1.06300   2.4720000
## 4250 meter buffer    -3.1640          + -12.3600     0.46190   3.0210000
## 1250 meter buffer    -1.5400          +  -3.2830    -0.87330   0.6002000
## 4500 meter buffer    -3.1770          + -11.5900    -0.12570   2.3210000
## 5000 meter buffer    -2.9500          +  -9.5120     2.01100   3.0470000
## 4750 meter buffer    -2.9020          + -10.2100     1.21200   2.9250000
## 3750 meter buffer    -2.0630          +  -7.5570    -0.28530   1.8680000
## 1750 meter buffer    -1.2670          +  -2.4830    -1.42300   0.1190000
## 3500 meter buffer    -2.1010          +  -7.2870    -0.70810   1.3820000
## 3250 meter buffer    -0.3758          +  -3.9270    -2.37000   0.0878400
## 3000 meter buffer    -0.2031          +  -3.0550    -2.93400  -0.1205000
## 2750 meter buffer     0.2775          +   0.8066    -2.37500  -0.2794000
## 2250 meter buffer    -0.6461          +   2.6130    -1.06400   0.5436000
## 2000 meter buffer    -0.7442          +   0.5012    -1.60200  -0.0001237
## 2500 meter buffer    -0.1102          +   0.9909    -2.07700  -0.2207000
##                   cnd(lc_c21) cnd(lc_c22) cnd(lc_c23) cnd(lc_c34) cnd(lc_c50)
## 250 meter buffer     -0.02139     0.82310    0.334200    -0.01718     0.05003
## 500 meter buffer      1.02300     2.56100    1.662000     2.70000     1.47400
## 1000 meter buffer    -0.99610     0.63190    0.488000    -0.19810     0.25530
## 750 meter buffer     -0.68520     1.29900    0.121600    -1.50100     0.01754
## 1500 meter buffer    -1.13400     1.20900   -0.657200    -2.59900    -0.55680
## 4000 meter buffer    -1.28200     1.08200    3.415000    31.62000     1.35600
## 4250 meter buffer    -1.30100     2.05700    3.583000    26.01000     1.64500
## 1250 meter buffer    -0.56750     1.28500    0.170100     0.45260     0.06635
## 4500 meter buffer    -2.24200     1.93700    3.601000    -1.36700     1.68800
## 5000 meter buffer    -1.26500     0.35690    5.558000    24.91000     2.14500
## 4750 meter buffer    -1.73800     0.94160    4.978000    22.74000     1.84500
## 3750 meter buffer    -1.88700     0.39800    2.642000    11.30000     1.01100
## 1750 meter buffer    -1.02700     1.33500    0.063110     4.37900     0.10100
## 3500 meter buffer    -2.00500     0.60230    2.294000     5.30400     0.86070
## 3250 meter buffer    -2.44600    -0.85350   -0.697700    -4.48900    -0.64980
## 3000 meter buffer    -2.13700    -0.80130   -1.454000    -8.13900    -1.22600
## 2750 meter buffer    -1.95800    -1.51600   -1.743000   -10.23000    -1.66000
## 2250 meter buffer    -1.75100    -0.54820    0.076010     0.62950    -0.64770
## 2000 meter buffer    -1.28200     0.01076   -0.002506     0.43760    -0.64780
## 2500 meter buffer    -1.49300    -1.10300   -1.262000     0.99500    -1.82600
##                   cnd(ppl) cnd(rds) cnd(ssm_lns) cnd(trl) cnd(wll) df   logLik
## 250 meter buffer  -0.58350  0.04315     -0.24170  -0.1009 -1.53600 15 -294.398
## 500 meter buffer  -1.18100 -0.37350     -0.52020  -0.5640  0.49210 15 -313.151
## 1000 meter buffer  0.46020  0.66440      0.71130   0.6546 -1.01000 15 -313.579
## 750 meter buffer  -0.33060  0.39990      0.08933   0.1862 -0.04885 15 -313.611
## 1500 meter buffer  1.03500  0.96360      1.27100   1.6530 -0.36790 15 -314.163
## 4000 meter buffer -0.05392  0.81850      0.80620   1.1670 -0.69150 15 -315.277
## 4250 meter buffer  0.33400  1.50500      1.15300   1.4560  0.31490 15 -315.758
## 1250 meter buffer  0.95440  0.90300      0.84580   1.2300 -0.93390 15 -315.859
## 4500 meter buffer  1.63800  2.26200      1.74200   1.8280 -0.70290 15 -316.302
## 5000 meter buffer -1.26300  1.42800      0.56130   0.5805  0.78170 15 -317.172
## 4750 meter buffer -0.63540  1.47000      0.79400   0.7656  0.83780 15 -317.404
## 3750 meter buffer  0.57060  1.07800      1.07100   1.3880 -0.48320 15 -317.432
## 1750 meter buffer  0.40550  0.03219      0.92360   0.6467  1.10800 15 -317.524
## 3500 meter buffer  0.72160  1.32200      1.38500   1.4750  0.14660 15 -318.383
## 3250 meter buffer  0.18050  0.74830      1.18900   1.1820  1.57800 15 -318.817
## 3000 meter buffer  0.46020  0.92300      1.24900   1.3140  2.00700 15 -319.422
## 2750 meter buffer  0.87090  0.47250      0.88000   0.8917  1.42000 15 -320.792
## 2250 meter buffer  0.99960  0.23030      0.74480   0.8051  0.13300 15 -321.218
## 2000 meter buffer  0.76620  0.27650      0.88490   0.7060  0.53110 15 -321.295
## 2500 meter buffer  0.59280  0.25370      0.97210   1.0170  1.66700 15 -321.567
##                    AICc delta weight
## 250 meter buffer  622.5  0.00      1
## 500 meter buffer  659.8 37.29      0
## 1000 meter buffer 660.7 38.14      0
## 750 meter buffer  660.8 38.21      0
## 1500 meter buffer 661.9 39.31      0
## 4000 meter buffer 664.1 41.54      0
## 4250 meter buffer 665.0 42.50      0
## 1250 meter buffer 665.2 42.70      0
## 4500 meter buffer 666.1 43.59      0
## 5000 meter buffer 667.9 45.33      0
## 4750 meter buffer 668.3 45.79      0
## 3750 meter buffer 668.4 45.85      0
## 1750 meter buffer 668.6 46.03      0
## 3500 meter buffer 670.3 47.75      0
## 3250 meter buffer 671.2 48.62      0
## 3000 meter buffer 672.4 49.83      0
## 2750 meter buffer 675.1 52.57      0
## 2250 meter buffer 676.0 53.42      0
## 2000 meter buffer 676.1 53.58      0
## 2500 meter buffer 676.7 54.12      0
## Models ranked by AICc(x) 
## Random terms (all models): 
##   cond(1 | array)

It seems suspicious that the 250 meter buffer, which is the only one that had missing data (and so has fewer rows after na.omit()), would perform that much better than all the others. Models should not be compared with AICc unless they are fitted to the same data, hence the warning message.
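One way to confirm which models were fitted to different data is to compare the number of observations each one used. The sketch below uses base `glm()` on made-up data to stay self-contained. It assumes `nobs()` also works on the fitted `glmmTMB` objects, in which case the same `sapply()` call could be pointed at `black_bear_mods`. The names `toy`, `toy_mods`, `n_obs` and `same_data` are hypothetical.

```r
# toy data with made-up counts
toy <- data.frame(
  present = c(2, 5, 1, 8, 4, 7),
  absent  = c(10, 7, 11, 4, 8, 5),
  roads   = c(0.0, 0.3, 0.1, 0.8, 0.2, 0.6)
)

# two models with the same formula, one fitted after dropping a row
toy_mods <- list(
  'all rows'   = glm(cbind(present, absent) ~ roads,
                     data = toy, family = 'binomial'),
  'fewer rows' = glm(cbind(present, absent) ~ roads,
                     data = toy[-1, ], family = 'binomial')
)

# number of observations used by each model
n_obs <- sapply(toy_mods, nobs)

# TRUE only if every model used the same number of rows
same_data <- length(unique(n_obs)) == 1

n_obs
same_data
```

Equal row counts do not prove the rows are identical, but unequal counts are enough to rule out an AICc comparison.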

Let’s remove the 250 meter buffer and see what happens.

black_bear_mods_no250 <- black_bear_mods  %>% 
  
  purrr::discard_at('250 meter buffer')

# run model selection again
model.sel(black_bear_mods_no250)
## Model selection table 
##                   cnd((Int)) dsp((Int)) cnd(brr) cnd(lc_c11) cnd(lc_c20)
## 500 meter buffer     -1.8100          +  -8.8800     1.48300   2.1650000
## 1000 meter buffer    -1.2440          +   5.0040    -0.79470   0.9692000
## 750 meter buffer     -0.9882          +   3.2100     0.08085   1.0990000
## 1500 meter buffer    -1.2430          +  -1.6840    -1.97700  -0.1470000
## 4000 meter buffer    -2.5810          + -10.0700     1.06300   2.4720000
## 4250 meter buffer    -3.1640          + -12.3600     0.46190   3.0210000
## 1250 meter buffer    -1.5400          +  -3.2830    -0.87330   0.6002000
## 4500 meter buffer    -3.1770          + -11.5900    -0.12570   2.3210000
## 5000 meter buffer    -2.9500          +  -9.5120     2.01100   3.0470000
## 4750 meter buffer    -2.9020          + -10.2100     1.21200   2.9250000
## 3750 meter buffer    -2.0630          +  -7.5570    -0.28530   1.8680000
## 1750 meter buffer    -1.2670          +  -2.4830    -1.42300   0.1190000
## 3500 meter buffer    -2.1010          +  -7.2870    -0.70810   1.3820000
## 3250 meter buffer    -0.3758          +  -3.9270    -2.37000   0.0878400
## 3000 meter buffer    -0.2031          +  -3.0550    -2.93400  -0.1205000
## 2750 meter buffer     0.2775          +   0.8066    -2.37500  -0.2794000
## 2250 meter buffer    -0.6461          +   2.6130    -1.06400   0.5436000
## 2000 meter buffer    -0.7442          +   0.5012    -1.60200  -0.0001237
## 2500 meter buffer    -0.1102          +   0.9909    -2.07700  -0.2207000
##                   cnd(lc_c21) cnd(lc_c22) cnd(lc_c23) cnd(lc_c34) cnd(lc_c50)
## 500 meter buffer       1.0230     2.56100    1.662000      2.7000     1.47400
## 1000 meter buffer     -0.9961     0.63190    0.488000     -0.1981     0.25530
## 750 meter buffer      -0.6852     1.29900    0.121600     -1.5010     0.01754
## 1500 meter buffer     -1.1340     1.20900   -0.657200     -2.5990    -0.55680
## 4000 meter buffer     -1.2820     1.08200    3.415000     31.6200     1.35600
## 4250 meter buffer     -1.3010     2.05700    3.583000     26.0100     1.64500
## 1250 meter buffer     -0.5675     1.28500    0.170100      0.4526     0.06635
## 4500 meter buffer     -2.2420     1.93700    3.601000     -1.3670     1.68800
## 5000 meter buffer     -1.2650     0.35690    5.558000     24.9100     2.14500
## 4750 meter buffer     -1.7380     0.94160    4.978000     22.7400     1.84500
## 3750 meter buffer     -1.8870     0.39800    2.642000     11.3000     1.01100
## 1750 meter buffer     -1.0270     1.33500    0.063110      4.3790     0.10100
## 3500 meter buffer     -2.0050     0.60230    2.294000      5.3040     0.86070
## 3250 meter buffer     -2.4460    -0.85350   -0.697700     -4.4890    -0.64980
## 3000 meter buffer     -2.1370    -0.80130   -1.454000     -8.1390    -1.22600
## 2750 meter buffer     -1.9580    -1.51600   -1.743000    -10.2300    -1.66000
## 2250 meter buffer     -1.7510    -0.54820    0.076010      0.6295    -0.64770
## 2000 meter buffer     -1.2820     0.01076   -0.002506      0.4376    -0.64780
## 2500 meter buffer     -1.4930    -1.10300   -1.262000      0.9950    -1.82600
##                   cnd(ppl) cnd(rds) cnd(ssm_lns) cnd(trl) cnd(wll) df   logLik
## 500 meter buffer  -1.18100 -0.37350     -0.52020  -0.5640  0.49210 15 -313.151
## 1000 meter buffer  0.46020  0.66440      0.71130   0.6546 -1.01000 15 -313.579
## 750 meter buffer  -0.33060  0.39990      0.08933   0.1862 -0.04885 15 -313.611
## 1500 meter buffer  1.03500  0.96360      1.27100   1.6530 -0.36790 15 -314.163
## 4000 meter buffer -0.05392  0.81850      0.80620   1.1670 -0.69150 15 -315.277
## 4250 meter buffer  0.33400  1.50500      1.15300   1.4560  0.31490 15 -315.758
## 1250 meter buffer  0.95440  0.90300      0.84580   1.2300 -0.93390 15 -315.859
## 4500 meter buffer  1.63800  2.26200      1.74200   1.8280 -0.70290 15 -316.302
## 5000 meter buffer -1.26300  1.42800      0.56130   0.5805  0.78170 15 -317.172
## 4750 meter buffer -0.63540  1.47000      0.79400   0.7656  0.83780 15 -317.404
## 3750 meter buffer  0.57060  1.07800      1.07100   1.3880 -0.48320 15 -317.432
## 1750 meter buffer  0.40550  0.03219      0.92360   0.6467  1.10800 15 -317.524
## 3500 meter buffer  0.72160  1.32200      1.38500   1.4750  0.14660 15 -318.383
## 3250 meter buffer  0.18050  0.74830      1.18900   1.1820  1.57800 15 -318.817
## 3000 meter buffer  0.46020  0.92300      1.24900   1.3140  2.00700 15 -319.422
## 2750 meter buffer  0.87090  0.47250      0.88000   0.8917  1.42000 15 -320.792
## 2250 meter buffer  0.99960  0.23030      0.74480   0.8051  0.13300 15 -321.218
## 2000 meter buffer  0.76620  0.27650      0.88490   0.7060  0.53110 15 -321.295
## 2500 meter buffer  0.59280  0.25370      0.97210   1.0170  1.66700 15 -321.567
##                    AICc delta weight
## 500 meter buffer  659.8  0.00  0.331
## 1000 meter buffer 660.7  0.86  0.216
## 750 meter buffer  660.8  0.92  0.209
## 1500 meter buffer 661.9  2.02  0.120
## 4000 meter buffer 664.1  4.25  0.040
## 4250 meter buffer 665.0  5.21  0.024
## 1250 meter buffer 665.2  5.42  0.022
## 4500 meter buffer 666.1  6.30  0.014
## 5000 meter buffer 667.9  8.04  0.006
## 4750 meter buffer 668.3  8.51  0.005
## 3750 meter buffer 668.4  8.56  0.005
## 1750 meter buffer 668.6  8.75  0.004
## 3500 meter buffer 670.3 10.46  0.002
## 3250 meter buffer 671.2 11.33  0.001
## 3000 meter buffer 672.4 12.54  0.001
## 2750 meter buffer 675.1 15.28  0.000
## 2250 meter buffer 676.0 16.13  0.000
## 2000 meter buffer 676.1 16.29  0.000
## 2500 meter buffer 676.7 16.83  0.000
## Models ranked by AICc(x) 
## Random terms (all models): 
##   cond(1 | array)

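This looks much more realistic: several buffers now have similar support (delta AICc below 2 for the 500, 1000, and 750 meter buffers) rather than one model taking all of the weight.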
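The weight column follows directly from the delta column: each Akaike weight is exp(-delta/2) divided by the sum of exp(-delta/2) over all models in the set. The sketch below recomputes the weights from the deltas copied out of the second table. The object names are hypothetical, and because the printed deltas are rounded the results should only match the table approximately.

```r
# delta AICc values copied from the model selection table without the 250 meter buffer
delta_aicc <- c(0.00, 0.86, 0.92, 2.02, 4.25, 5.21, 5.42, 6.30, 8.04, 8.51,
                8.56, 8.75, 10.46, 11.33, 12.54, 15.28, 16.13, 16.29, 16.83)

# relative likelihood of each model given the data
rel_lik <- exp(-delta_aicc / 2)

# Akaike weights: relative likelihoods rescaled to sum to 1
aicc_weights <- rel_lik / sum(rel_lik)

round(aicc_weights, 3)
```

The same formula explains the first table: a delta of 37.29 gives a relative likelihood of exp(-18.645), which is effectively zero, so the 250 meter buffer model took all of the weight.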

Now repeat for other species